A new variance-based approach for discriminative feature extraction in machine hearing classification using spectrogram features

نویسندگان

  • Zhipeng Xie
  • Ian Vince McLoughlin
  • Haomin Zhang
  • Yan Song
  • Wei Xiao
چکیده

Machine hearing is an emerging research field that is analogous to machine vision in that it aims to equip computers with the ability to hear and recognise a variety of sounds. It is a key enabler of natural human-computer speech interfacing, as well as in areas such as automated security surveillance, environmental monitoring, smart homes/buildings/cities. Recent advances in machine learning allow current systems to accurately recognise a diverse range of sounds under controlled conditions. However doing so in real-world noisy conditions remains a challenging task. Several front-end feature extraction methods have been used for machine hearing, employing speech recognition features like MFCC and PLP, as well as image-like features such as AIM and SIF. The best choice of feature is found to be dependent upon the noise environment and machine learning techniques used. Machine learning methods such as deep neural networks have been shown capable of inferring discriminative classification rules from less structured front-end features in related domains. In the machine hearing field, spectrogram image features have recently shown good performance for noise-corrupted classification using deep neural networks. However there are many methods of extracting features from spectrograms. This paper explores a novel data-driven feature extraction method that uses variance-based criteria to define spectral pooling of features from spectrograms. The proposed method, based on maximising the pooled spectral variance of foreground and background sound models, is shown to achieve very good performance for robust classification.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...

متن کامل

دو روش تبدیل ویژگی مبتنی بر الگوریتم های ژنتیک برای کاهش خطای دسته بندی ماشین بردار پشتیبان

Discriminative methods are used for increasing pattern recognition and classification accuracy. These methods can be used as discriminant transformations applied to features or they can be used as discriminative learning algorithms for the classifiers. Usually, discriminative transformations criteria are different from the criteria of  discriminant classifiers training or  their error. In this ...

متن کامل

Mental Arithmetic Task Recognition Using Effective Connectivity and Hierarchical Feature Selection From EEG Signals

Introduction: Mental arithmetic analysis based on Electroencephalogram (EEG) signal for monitoring the state of the user’s brain functioning can be helpful for understanding some psychological disorders such as attention deficit hyperactivity disorder, autism spectrum disorder, or dyscalculia where the difficulty in learning or understanding the arithmetic exists. Most mental arithmetic recogni...

متن کامل

Automatic Face Recognition via Local Directional Patterns

Automatic facial recognition has many potential applications in different areas of humancomputer interaction. However, they are not yet fully realized due to the lack of an effectivefacial feature descriptor. In this paper, we present a new appearance based feature descriptor,the local directional pattern (LDP), to represent facial geometry and analyze its performance inrecognition. An LDP feat...

متن کامل

Heart Rate Variability Classification using Support Vector Machine and Genetic Algorithm

Background: Electrocardiogram (ECG) is defined as an electrical signal, which represents cardiac activity. Heart rate variability (HRV) as the variation of interval between two consecutive heartbeats represents the balance between the sympathetic and parasympathetic branches of the autonomic nervous system.Objective: In this study, we aimed to evaluate the efficiency of discrete wavelet transfo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Digital Signal Processing

دوره 54  شماره 

صفحات  -

تاریخ انتشار 2016